A Wildcard Filename Search.................Bob Sander-Cederlof

Over the years I have fallen into certain habits when it comes to naming files.  I find it convenient to use names starting with "S." for assembly language source files, "B." for binary object code files, and so on.  Others like to use suffixes like ".SRC" and ".OBJ" for the same reasons.  Some operating systems, such as CP/M, use suffixes to indicate file type.  Others, like ProDOS, let you build sub-directories to categorize your files.

Sometimes I would like to have the ability to do the same operation on a whole group of files.  For example, I might want to DELETE all files starting with "B.".  Or I might want to copy a whole group of files from one disk to another.  If the files happen to have similar names, and if DOS allowed wildcards in filenames, it would be easier.

Some DOS 3.3 programs do have this feature:  Apple's FID program, Sensible Software's Super Disk Copy, and others.  They have a method for specifying a filename without spelling out the entire name.

The subroutine inside DOS 3.3 which compares a filename you have specified with the names in a catalog is found at $B1F5:

       LDY #0
       INX
       INX
 .1    INX
       LDA ($42),Y  Filename you specified
       CMP $B4C6,X  Filename in catalog sector
       BNE ...      ...did not match
       INY
       CPY #30
       BNE .1
   ... matched ...

This is a very straightforward string comparison.  It requires an exact match of all 30 characters of a filename.  There is a similar routine at $A782 which compares a filename you specify with the filenames in the open file buffers.
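
For readers who think more easily in a high-level language, the exact comparison above can be sketched in Python.  This is only an illustration of the logic, not a translation of the DOS code: the names NAME_LENGTH, pad, and exact_match are my own, and I ignore details such as the high bit DOS sets on each character.

```python
NAME_LENGTH = 30  # DOS 3.3 filenames occupy a fixed 30-character field


def pad(name):
    """Left-justify and blank-fill a name to exactly NAME_LENGTH characters."""
    return name.ljust(NAME_LENGTH)[:NAME_LENGTH]


def exact_match(wanted, catalog_name):
    """Return True only if all 30 characters of the two fields are identical."""
    a = pad(wanted)
    b = pad(catalog_name)
    for y in range(NAME_LENGTH):      # y plays the role of the Y-register
        if a[y] != b[y]:
            return False              # first mismatch ends the comparison
    return True
```

Because every position is compared, a name which differs by even one character is rejected.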

I wrote a subroutine called MATCH which compares two 30-character strings, allowing wildcards.  Unfortunately, it is not a simple matter to plug such a subroutine into DOS 3.3, and I have not done that.  It is more likely that this subroutine will find its way into some future utility programs.

I also wrote a testing program, so that I could see if my code worked.  The program in lines 1110-1380 searches through a list of 30-character strings, printing those which match a key string.  To simplify my test program (a good idea to keep testers simple, so they are not themselves more buggy than the testees!) I assembled in the key string and the list of strings to be searched.  A slightly better test would allow me to type in the key string.

My MATCH program assumes that the address of the string to be compared with the key is stored at FN and FN+1.  Characters in the filename are addressed by "(FN),Y", and in the key are addressed by "KEY,X".  MATCH will return with carry set if the filename matches the key, and carry clear if not.

Both the filename and the key are stored "left-justified, blank-filled".  That means there may be any number of non-significant blanks on the right end.  Lines 1490-1530 scan the current filename from right-to-left, looking for the last non-blank in the name.  Lines 1550-1590 do the same for the key.  If there is any chance either filename or key could be completely blank, an extra line "BMI ERROR" should be inserted at 1505 and 1565.
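
The right-to-left scan for the last non-blank can be sketched in Python as follows.  The function name last_nonblank is hypothetical; the result of -1 for an all-blank field corresponds to the case the "BMI ERROR" line would catch, since the index goes negative.

```python
def last_nonblank(field):
    """Return the index of the last non-blank character, or -1 if all blank."""
    y = len(field) - 1                 # start at the right end of the field
    while y >= 0 and field[y] == " ":  # step left over trailing blanks
        y -= 1
    return y                           # -1 means the whole field was blank
```

Only trailing blanks are skipped; a blank embedded in the middle of a name still counts as a significant character.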

I save the index to the right end of the key in KEY.START.  Because the end of the filename and key strings is variable, I actually do the comparison from right to left.  This makes the "end" actually the beginning.

Line 1610 could be "JMP .4" or "BNE .4", because the object is to get to line 1660.  However, the "INX" allows me to fall through lines 1630-1640 and it takes only one byte rather than two or three.

The comparison begins at line 1660.  Remember we are scanning backwards, from right to left.  Lines 1660-1670 save the two string pointers.  Line 1680 gets the next character from the key.  If it is a wildcard, I branch back to line 1630.  Note that all that happens is that the wildcard is skipped over!

If the key character is not a wildcard, it gets compared to the next character of the filename at line 1710.  If it matches, lines 1730-1760 advance both pointers and the comparison continues.  These lines also check to see if we have come to the left end of the key or of the filename.

If we are at the end of the filename, lines 1770-1820 check the rest of the key.  If there are any characters left in the key which are not wildcards, then the current filename does not match.  Otherwise, it does match.  Lines 1830-1880 set the appropriate carry status and return.

If we are at the end of the key, lines 1900-1910 check whether we are also at the end of the filename.  If so, the filename matched.  If not, maybe it did not match.  I say maybe, because if there was a wildcard, we might come out with a match if we widen the amount matched by that wildcard.  Lines 1920-1990 will handle that possibility.

Two conditions bring us to line 1930.  Either a character in the key did not match the current character in the filename, or there are unmatched filename characters left over after the end of the key.  In either case, if there has been no wildcard in the key (so far), then the filename does not match the key.  If there has been a wildcard, we can try again to match from the most recent wildcard on.  We can tell whether or not there has been a wildcard so far by comparing KEY.PNTR with KEY.START.  If they are the same, there has been no wildcard.  Lines 1920-1990 handle all these details.
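
The whole procedure described above might be sketched in Python like this.  It is a sketch of the same idea, not a line-for-line translation of my 6502 code: the names match, saved_k, and saved_f are invented for the illustration, and I return True or False where the assembly version returns carry set or clear.

```python
def match(filename, key, wildcard="="):
    """Compare filename with key from right to left, allowing wildcards."""
    name = filename.rstrip(" ")   # drop non-significant trailing blanks
    pat = key.rstrip(" ")
    f = len(name) - 1             # current position in the filename
    k = len(pat) - 1              # current position in the key
    saved_k = None                # key position just left of the latest wildcard
    saved_f = None                # filename position where that wildcard stopped

    while f >= 0:
        if k >= 0 and pat[k] == wildcard:
            k -= 1                        # skip over the wildcard...
            saved_k, saved_f = k, f       # ...and remember where to retry from
        elif k >= 0 and pat[k] == name[f]:
            k -= 1                        # characters agree: advance both
            f -= 1
        elif saved_k is not None:
            saved_f -= 1                  # widen the latest wildcard by one
            k, f = saved_k, saved_f       # and retry from just past it
        else:
            return False                  # mismatch with no wildcard seen yet

    # Filename used up: anything left in the key must be wildcards.
    while k >= 0 and pat[k] == wildcard:
        k -= 1
    return k < 0
```

Note that only the most recent wildcard is ever widened.  Earlier wildcards never need to be revisited, because the most recent one can absorb as many filename characters as necessary.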

I made the wildcard character itself a variable, so that you can change it under program control.  Since "=" is a valid character in a filename, you might want to use something else.

With this kind of MATCH subroutine, a key of "=.OBJ" would match all names ending with ".OBJ"; "S.=" would match all names starting with "S."; "=A=B=" would match all names containing "A" followed by "B".
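
One way to experiment with such keys is to translate them into anchored regular expressions.  This is a different technique from the one MATCH uses, offered only as a quick way to try out the behavior; the names key_to_regex and matches are hypothetical, and the sketch assumes "=" as the wildcard unless told otherwise.

```python
import re


def key_to_regex(key, wildcard="="):
    """Turn a key such as "=.OBJ" into a compiled regular expression."""
    parts = key.rstrip(" ").split(wildcard)
    # Literal pieces are escaped; each wildcard becomes "any run of characters".
    return re.compile(".*".join(re.escape(p) for p in parts), re.DOTALL)


def matches(filename, key, wildcard="="):
    """Return True if the whole filename fits the key (anchored at both ends)."""
    name = filename.rstrip(" ")
    return key_to_regex(key, wildcard).fullmatch(name) is not None


if __name__ == "__main__":
    names = ["S.MATCH", "B.MATCH", "PROG.OBJ", "PROG.SRC", "XAYBZ"]
    for key in ["=.OBJ", "S.=", "=A=B="]:
        print(key, [n for n in names if matches(n, key)])
```

Using fullmatch rather than search is what ties the key to both ends of the name, so "S.=" does not accept a name that merely contains "S." somewhere in the middle.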

You can see the similarity between MATCH and a global search capability such as you might find in a word processor, or in the S-C Macro Assembler.  The FIND and REPLACE commands in S-C Macro allow wildcards.  However, MATCH differs in that it anchors the key to the beginning and end of the file name (unless you specify a wildcard in those positions).

If string comparisons of this type intrigue you, the book "Software Tools" develops an extremely powerful one in chapter 5.  "Software Tools" is a classic book by Kernighan and Plauger, available at many bookstores.  (A "classic" in computer books is one still in print after five years; this one qualifies, since it was originally published in 1976.)  Their string match routine allows single- and multi-character wildcards, semi-wildcards that match subsets of characters, control of anchoring, and more.  It would be a worthwhile exercise to try implementing their algorithm in 6502 assembly language.
